Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 300190 |
| Missing cells | 49201 |
| Missing cells (%) | 1.2% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 32.1 MiB |
| Average record size in memory | 112.0 B |
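The percentages and sizes in this table follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell percentage is taken over all cells (rows × variables) and that MiB means 2^20 bytes; the variable names are illustrative only:

```python
# Figures copied from the dataset statistics table.
n_variables = 14
n_observations = 300190
missing_cells = 49201
avg_record_bytes = 112.0

# Missing cells as a share of all cells (rows x variables).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total memory implied by the average record size, in MiB (2**20 bytes).
total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells:   {total_cells}")          # 4202660
print(f"missing cells: {missing_pct:.1f}%")     # 1.2%
print(f"total size:    {total_mib:.1f} MiB")    # 32.1 MiB
```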
Variable types
| Categorical | 4 |
|---|---|
| DateTime | 1 |
| Numeric | 9 |
Alerts
| Alert | Type |
|---|---|
| VERSIE has constant value "1.0" | Constant |
| DATUM_BESTAND has constant value "2022-06-21" | Constant |
| PEILDATUM has constant value "2022-06-01" | Constant |
| TYPERENDE_DIAGNOSE_CD has a high cardinality: 1818 distinct values | High cardinality |
| BEHANDELEND_SPECIALISME_CD is highly correlated with AANTAL_PAT_PER_SPC | High correlation |
| AANTAL_PAT_PER_ZPD is highly correlated with AANTAL_SUBTRAJECT_PER_ZPD | High correlation |
| AANTAL_SUBTRAJECT_PER_ZPD is highly correlated with AANTAL_PAT_PER_ZPD | High correlation |
| AANTAL_PAT_PER_DIAG is highly correlated with AANTAL_SUBTRAJECT_PER_DIAG | High correlation |
| AANTAL_SUBTRAJECT_PER_DIAG is highly correlated with AANTAL_PAT_PER_DIAG | High correlation |
| AANTAL_PAT_PER_SPC is highly correlated with BEHANDELEND_SPECIALISME_CD and 1 other field | High correlation |
| AANTAL_SUBTRAJECT_PER_SPC is highly correlated with AANTAL_PAT_PER_SPC | High correlation |
| AANTAL_PAT_PER_SPC is highly correlated with AANTAL_SUBTRAJECT_PER_SPC | High correlation |
| DATUM_BESTAND is highly correlated with PEILDATUM and 1 other field | High correlation |
| PEILDATUM is highly correlated with DATUM_BESTAND and 1 other field | High correlation |
| VERSIE is highly correlated with DATUM_BESTAND and 1 other field | High correlation |
| JAAR is highly correlated with AANTAL_PAT_PER_SPC and 1 other field | High correlation |
| AANTAL_PAT_PER_SPC is highly correlated with JAAR and 1 other field | High correlation |
| AANTAL_SUBTRAJECT_PER_SPC is highly correlated with JAAR and 1 other field | High correlation |
| GEMIDDELDE_VERKOOPPRIJS has 49201 (16.4%) missing values | Missing |
| AANTAL_SUBTRAJECT_PER_ZPD is highly skewed (γ₁ = 21.16688519) | Skewed |
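The skewness coefficient quoted in the last alert can be illustrated with a short sketch. It assumes the reported value is the bias-adjusted sample skewness (the adjusted Fisher-Pearson coefficient, which is what pandas computes by default); the function name `sample_skewness` is hypothetical and not part of the report.

```python
import math


def sample_skewness(values):
    """Bias-adjusted sample skewness (adjusted Fisher-Pearson coefficient)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    # Second and third central moments (population form, divided by n).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # constant data: no asymmetry to measure
    g1 = m3 / m2 ** 1.5
    # Small-sample adjustment; tends to 1 as n grows.
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


# A long right tail, like the per-product counts in this dataset, gives a
# positive coefficient; symmetric data gives zero.
print(sample_skewness([1, 1, 1, 2, 50]) > 0)
print(sample_skewness([1, 2, 3, 4, 5]))
```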
Reproduction
| Analysis started | 2022-07-08 10:11:57.220146 |
|---|---|
| Analysis finished | 2022-07-08 10:12:20.475652 |
| Duration | 23.26 seconds |
| Software version | pandas-profiling v3.2.0 |
| Download configuration | config.json |
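A report like this one could be regenerated roughly as follows. This is a sketch, not the original run: the CSV path, the report title, and the column type choices are assumptions, and only `ProfileReport(...)` and `to_file(...)` from pandas-profiling are used.

```python
def build_report(csv_path, output_path="report.html"):
    """Profile a CSV file and write the report as HTML (hypothetical helper)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(
        csv_path,
        # Assumption: diagnosis codes are read as strings to keep leading zeros ("07").
        dtype={"TYPERENDE_DIAGNOSE_CD": str},
        # Assumption: JAAR is parsed as a date, matching the single DateTime variable.
        parse_dates=["JAAR"],
    )
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(output_path)
    return output_path
```

Calling `build_report("data.csv")` with a hypothetical file name would write `report.html` next to the script; run time and memory figures will differ from those listed above.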
VERSIE
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 900570 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 1.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 300190 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 300190 | |
| . | 300190 | |
| 0 | 300190 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 600380 | |
| Other Punctuation | 300190 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 300190 | |
| 0 | 300190 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 300190 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 900570 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 300190 | |
| . | 300190 | |
| 0 | 300190 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 900570 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 300190 | |
| . | 300190 | |
| 0 | 300190 |
DATUM_BESTAND
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 3001900 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2022-06-21 |
|---|---|
| 2nd row | 2022-06-21 |
| 3rd row | 2022-06-21 |
| 4th row | 2022-06-21 |
| 5th row | 2022-06-21 |
Common Values
| Value | Count | Frequency (%) |
| 2022-06-21 | 300190 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 1200760 | |
| 0 | 600380 | |
| - | 600380 | |
| 6 | 300190 | 10.0% |
| 1 | 300190 | 10.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2401520 | |
| Dash Punctuation | 600380 | 20.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 1200760 | |
| 0 | 600380 | |
| 6 | 300190 | 12.5% |
| 1 | 300190 | 12.5% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 600380 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3001900 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 1200760 | |
| 0 | 600380 | |
| - | 600380 | |
| 6 | 300190 | 10.0% |
| 1 | 300190 | 10.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3001900 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 1200760 | |
| 0 | 600380 | |
| - | 600380 | |
| 6 | 300190 | 10.0% |
| 1 | 300190 | 10.0% |
PEILDATUM
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 3001900 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2022-06-01 |
|---|---|
| 2nd row | 2022-06-01 |
| 3rd row | 2022-06-01 |
| 4th row | 2022-06-01 |
| 5th row | 2022-06-01 |
Common Values
| Value | Count | Frequency (%) |
| 2022-06-01 | 300190 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 900570 | |
| 0 | 900570 | |
| - | 600380 | |
| 6 | 300190 | 10.0% |
| 1 | 300190 | 10.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2401520 | |
| Dash Punctuation | 600380 | 20.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 900570 | |
| 0 | 900570 | |
| 6 | 300190 | 12.5% |
| 1 | 300190 | 12.5% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 600380 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3001900 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 900570 | |
| 0 | 900570 | |
| - | 600380 | |
| 6 | 300190 | 10.0% |
| 1 | 300190 | 10.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3001900 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 900570 | |
| 0 | 900570 | |
| - | 600380 | |
| 6 | 300190 | 10.0% |
| 1 | 300190 | 10.0% |
JAAR
Date
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
| Minimum | 2012-01-01 00:00:00 |
|---|---|
| Maximum | 2022-01-01 00:00:00 |
BEHANDELEND_SPECIALISME_CD
Real number (ℝ≥0)
| Distinct | 28 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 430.6566208 |
| Minimum | 301 |
|---|---|
| Maximum | 8418 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 301 |
|---|---|
| 5-th percentile | 302 |
| Q1 | 305 |
| median | 313 |
| Q3 | 322 |
| 95-th percentile | 335 |
| Maximum | 8418 |
| Range | 8117 |
| Interquartile range (IQR) | 17 |
Descriptive statistics
| Standard deviation | 958.8150454 |
|---|---|
| Coefficient of variation (CV) | 2.226402658 |
| Kurtosis | 65.28052091 |
| Mean | 430.6566208 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 8.196516408 |
| Sum | 129278811 |
| Variance | 919326.2914 |
| Monotonicity | Not monotonic |
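Several of the figures in the two tables above are derived from one another. The sketch below checks those relationships using the reported numbers; it assumes the usual definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared).

```python
import math

# Figures copied from the quantile and descriptive statistics tables above.
minimum, q1, q3, maximum = 301, 305, 322, 8418
mean = 430.6566208
std = 958.8150454

value_range = maximum - minimum   # reported: 8117
iqr = q3 - q1                     # reported: 17
cv = std / mean                   # reported: 2.226402658
variance = std ** 2               # reported: 919326.2914

print(value_range, iqr)
print(math.isclose(cv, 2.226402658, rel_tol=1e-6))
print(math.isclose(variance, 919326.2914, rel_tol=1e-6))
```

A coefficient of variation above 2 together with an IQR of only 17 indicates that a small number of large codes (such as 8418) dominate the spread.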
| Value | Count | Frequency (%) |
| 305 | 42433 | |
| 313 | 38971 | |
| 303 | 34672 | |
| 330 | 23836 | 7.9% |
| 316 | 20357 | 6.8% |
| 308 | 15859 | 5.3% |
| 306 | 12562 | 4.2% |
| 324 | 12270 | 4.1% |
| 301 | 12187 | 4.1% |
| 304 | 9758 | 3.3% |
| Other values (18) | 77285 |
| Value | Count | Frequency (%) |
| 301 | 12187 | 4.1% |
| 302 | 6609 | 2.2% |
| 303 | 34672 | |
| 304 | 9758 | 3.3% |
| 305 | 42433 | |
| 306 | 12562 | 4.2% |
| 307 | 5262 | 1.8% |
| 308 | 15859 | 5.3% |
| 310 | 3380 | 1.1% |
| 313 | 38971 |
| Value | Count | Frequency (%) |
| 8418 | 4133 | 1.4% |
| 8416 | 123 | < 0.1% |
| 1900 | 193 | 0.1% |
| 390 | 824 | 0.3% |
| 389 | 3208 | 1.1% |
| 362 | 4143 | 1.4% |
| 361 | 2152 | 0.7% |
| 335 | 3033 | 1.0% |
| 330 | 23836 | |
| 329 | 796 | 0.3% |
TYPERENDE_DIAGNOSE_CD
Categorical
| Distinct | 1818 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
| 101 | 1283 |
|---|---|
| 402 | 1248 |
| 403 | 1208 |
| 301 | 1207 |
| 201 | 1145 |
| Other values (1813) |
Length
| Max length | 4 |
|---|---|
| Median length | 3 |
| Mean length | 3.349488657 |
| Min length | 2 |
Characters and Unicode
| Total characters | 1005483 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 29 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 07 |
|---|---|
| 2nd row | 15 |
| 3rd row | 20 |
| 4th row | 12 |
| 5th row | 20 |
Common Values
| Value | Count | Frequency (%) |
| 101 | 1283 | 0.4% |
| 402 | 1248 | 0.4% |
| 403 | 1208 | 0.4% |
| 301 | 1207 | 0.4% |
| 201 | 1145 | 0.4% |
| 203 | 1141 | 0.4% |
| 401 | 1020 | 0.3% |
| 404 | 1009 | 0.3% |
| 409 | 982 | 0.3% |
| 802 | 972 | 0.3% |
| Other values (1808) | 288975 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 192628 | |
| 0 | 184065 | |
| 2 | 133257 | |
| 3 | 109062 | |
| 5 | 77373 | |
| 9 | 72579 | 7.2% |
| 4 | 71568 | 7.1% |
| 7 | 59153 | 5.9% |
| 6 | 52397 | 5.2% |
| 8 | 43259 | 4.3% |
| Other values (15) | 10142 | 1.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 995341 | |
| Uppercase Letter | 10142 | 1.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| G | 1938 | |
| M | 1693 | |
| B | 1213 | |
| E | 851 | |
| Z | 828 | |
| D | 678 | 6.7% |
| A | 651 | 6.4% |
| F | 646 | 6.4% |
| C | 333 | 3.3% |
| K | 329 | 3.2% |
| Other values (5) | 982 |
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 192628 | |
| 0 | 184065 | |
| 2 | 133257 | |
| 3 | 109062 | |
| 5 | 77373 | |
| 9 | 72579 | 7.3% |
| 4 | 71568 | 7.2% |
| 7 | 59153 | 5.9% |
| 6 | 52397 | 5.3% |
| 8 | 43259 | 4.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 995341 | |
| Latin | 10142 | 1.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| G | 1938 | |
| M | 1693 | |
| B | 1213 | |
| E | 851 | |
| Z | 828 | |
| D | 678 | 6.7% |
| A | 651 | 6.4% |
| F | 646 | 6.4% |
| C | 333 | 3.3% |
| K | 329 | 3.2% |
| Other values (5) | 982 |
Common
| Value | Count | Frequency (%) |
| 1 | 192628 | |
| 0 | 184065 | |
| 2 | 133257 | |
| 3 | 109062 | |
| 5 | 77373 | |
| 9 | 72579 | 7.3% |
| 4 | 71568 | 7.2% |
| 7 | 59153 | 5.9% |
| 6 | 52397 | 5.3% |
| 8 | 43259 | 4.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1005483 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 192628 | |
| 0 | 184065 | |
| 2 | 133257 | |
| 3 | 109062 | |
| 5 | 77373 | |
| 9 | 72579 | 7.2% |
| 4 | 71568 | 7.1% |
| 7 | 59153 | 5.9% |
| 6 | 52397 | 5.2% |
| 8 | 43259 | 4.3% |
| Other values (15) | 10142 | 1.0% |
ZORGPRODUCT_CD
Real number (ℝ≥0)
| Distinct | 5968 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 439816122.5 |
| Minimum | 10501002 |
|---|---|
| Maximum | 998418081 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 10501002 |
|---|---|
| 5-th percentile | 28999037 |
| Q1 | 99799034 |
| median | 149599024 |
| Q3 | 990004004 |
| 95-th percentile | 990516035.5 |
| Maximum | 998418081 |
| Range | 987917079 |
| Interquartile range (IQR) | 890204970 |
Descriptive statistics
| Standard deviation | 428802968.6 |
|---|---|
| Coefficient of variation (CV) | 0.9749596403 |
| Kurtosis | -1.732758159 |
| Mean | 439816122.5 |
| Median Absolute Deviation (MAD) | 119600018 |
| Skewness | 0.4723035305 |
| Sum | 1.320284018 × 10¹⁴ |
| Variance | 1.838719859 × 10¹⁷ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 990004009 | 2189 | 0.7% |
| 990004007 | 2150 | 0.7% |
| 990003004 | 2102 | 0.7% |
| 990004006 | 1714 | 0.6% |
| 990356076 | 1555 | 0.5% |
| 990356073 | 1432 | 0.5% |
| 131999228 | 1381 | 0.5% |
| 990003007 | 1358 | 0.5% |
| 131999164 | 1355 | 0.5% |
| 199299013 | 1263 | 0.4% |
| Other values (5958) | 283691 |
| Value | Count | Frequency (%) |
| 10501002 | 8 | |
| 10501003 | 11 | |
| 10501004 | 11 | |
| 10501005 | 11 | |
| 10501007 | 3 | < 0.1% |
| 10501008 | 11 | |
| 10501010 | 11 | |
| 10501011 | 3 | < 0.1% |
| 11101002 | 9 | |
| 11101003 | 11 |
| Value | Count | Frequency (%) |
| 998418081 | 148 | |
| 998418080 | 133 | |
| 998418079 | 37 | < 0.1% |
| 998418077 | 8 | < 0.1% |
| 998418076 | 8 | < 0.1% |
| 998418075 | 6 | < 0.1% |
| 998418074 | 204 | |
| 998418073 | 192 | |
| 998418072 | 8 | < 0.1% |
| 998418071 | 8 | < 0.1% |
AANTAL_PAT_PER_ZPD
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 9770 |
|---|---|
| Distinct (%) | 3.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 514.3071755 |
| Minimum | 1 |
|---|---|
| Maximum | 164654 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 13 |
| Q3 | 101 |
| 95-th percentile | 1738 |
| Maximum | 164654 |
| Range | 164653 |
| Interquartile range (IQR) | 98 |
Descriptive statistics
| Standard deviation | 3186.890378 |
|---|---|
| Coefficient of variation (CV) | 6.196472712 |
| Kurtosis | 396.1231592 |
| Mean | 514.3071755 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | 16.54114341 |
| Sum | 154389871 |
| Variance | 10156270.28 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 50459 | 16.8% |
| 2 | 24552 | 8.2% |
| 3 | 15997 | 5.3% |
| 4 | 11675 | 3.9% |
| 5 | 9128 | 3.0% |
| 6 | 7690 | 2.6% |
| 7 | 6449 | 2.1% |
| 8 | 5435 | 1.8% |
| 9 | 4945 | 1.6% |
| 10 | 4364 | 1.5% |
| Other values (9760) | 159496 |
| Value | Count | Frequency (%) |
| 1 | 50459 | |
| 2 | 24552 | |
| 3 | 15997 | 5.3% |
| 4 | 11675 | 3.9% |
| 5 | 9128 | 3.0% |
| 6 | 7690 | 2.6% |
| 7 | 6449 | 2.1% |
| 8 | 5435 | 1.8% |
| 9 | 4945 | 1.6% |
| 10 | 4364 | 1.5% |
| Value | Count | Frequency (%) |
| 164654 | 1 | |
| 155884 | 1 | |
| 154270 | 1 | |
| 151516 | 1 | |
| 144725 | 1 | |
| 137459 | 1 | |
| 118039 | 1 | |
| 115941 | 1 | |
| 110520 | 1 | |
| 109675 | 1 |
AANTAL_SUBTRAJECT_PER_ZPD
Real number (ℝ≥0)
HIGH CORRELATION, SKEWED
| Distinct | 10484 |
|---|---|
| Distinct (%) | 3.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 606.8086812 |
| Minimum | 1 |
|---|---|
| Maximum | 239919 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 14 |
| Q3 | 110 |
| 95-th percentile | 1981 |
| Maximum | 239919 |
| Range | 239918 |
| Interquartile range (IQR) | 107 |
Descriptive statistics
| Standard deviation | 4085.892752 |
|---|---|
| Coefficient of variation (CV) | 6.73341183 |
| Kurtosis | 712.9122922 |
| Mean | 606.8086812 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | 21.16688519 |
| Sum | 182157898 |
| Variance | 16694519.58 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 48618 | 16.2% |
| 2 | 24142 | 8.0% |
| 3 | 15832 | 5.3% |
| 4 | 11461 | 3.8% |
| 5 | 9058 | 3.0% |
| 6 | 7680 | 2.6% |
| 7 | 6402 | 2.1% |
| 8 | 5388 | 1.8% |
| 9 | 4890 | 1.6% |
| 10 | 4357 | 1.5% |
| Other values (10474) | 162362 |
| Value | Count | Frequency (%) |
| 1 | 48618 | |
| 2 | 24142 | |
| 3 | 15832 | 5.3% |
| 4 | 11461 | 3.8% |
| 5 | 9058 | 3.0% |
| 6 | 7680 | 2.6% |
| 7 | 6402 | 2.1% |
| 8 | 5388 | 1.8% |
| 9 | 4890 | 1.6% |
| 10 | 4357 | 1.5% |
| Value | Count | Frequency (%) |
| 239919 | 1 | |
| 232431 | 1 | |
| 232118 | 1 | |
| 228048 | 1 | |
| 227606 | 1 | |
| 226825 | 1 | |
| 224099 | 1 | |
| 218623 | 1 | |
| 214231 | 1 | |
| 209056 | 1 |
AANTAL_PAT_PER_DIAG
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 8609 |
|---|---|
| Distinct (%) | 2.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7716.614947 |
| Minimum | 1 |
|---|---|
| Maximum | 227540 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 33 |
| Q1 | 380 |
| median | 1689 |
| Q3 | 6363 |
| 95-th percentile | 37157 |
| Maximum | 227540 |
| Range | 227539 |
| Interquartile range (IQR) | 5983 |
Descriptive statistics
| Standard deviation | 17946.67857 |
|---|---|
| Coefficient of variation (CV) | 2.325719074 |
| Kurtosis | 33.45850561 |
| Mean | 7716.614947 |
| Median Absolute Deviation (MAD) | 1554 |
| Skewness | 5.024953979 |
| Sum | 2316450641 |
| Variance | 322083271.7 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4 | 592 | 0.2% |
| 9 | 591 | 0.2% |
| 2 | 583 | 0.2% |
| 1 | 550 | 0.2% |
| 3 | 543 | 0.2% |
| 12 | 534 | 0.2% |
| 21 | 532 | 0.2% |
| 14 | 526 | 0.2% |
| 6 | 524 | 0.2% |
| 8 | 519 | 0.2% |
| Other values (8599) | 294696 |
| Value | Count | Frequency (%) |
| 1 | 550 | |
| 2 | 583 | |
| 3 | 543 | |
| 4 | 592 | |
| 5 | 495 | |
| 6 | 524 | |
| 7 | 502 | |
| 8 | 519 | |
| 9 | 591 | |
| 10 | 445 |
| Value | Count | Frequency (%) |
| 227540 | 23 | |
| 213989 | 24 | |
| 213752 | 17 | |
| 213538 | 25 | |
| 211597 | 17 | |
| 210437 | 19 | |
| 205349 | 17 | |
| 203859 | 23 | |
| 200604 | 16 | |
| 198530 | 20 |
AANTAL_SUBTRAJECT_PER_DIAG
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 9538 |
|---|---|
| Distinct (%) | 3.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 11088.68809 |
| Minimum | 1 |
|---|---|
| Maximum | 368506 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 42 |
| Q1 | 503 |
| median | 2355 |
| Q3 | 9075.25 |
| 95-th percentile | 52490 |
| Maximum | 368506 |
| Range | 368505 |
| Interquartile range (IQR) | 8572.25 |
Descriptive statistics
| Standard deviation | 26613.53588 |
|---|---|
| Coefficient of variation (CV) | 2.400061726 |
| Kurtosis | 37.01063881 |
| Mean | 11088.68809 |
| Median Absolute Deviation (MAD) | 2184 |
| Skewness | 5.267910555 |
| Sum | 3328713278 |
| Variance | 708280292.1 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 3 | 515 | 0.2% |
| 2 | 500 | 0.2% |
| 4 | 485 | 0.2% |
| 1 | 471 | 0.2% |
| 5 | 453 | 0.2% |
| 6 | 451 | 0.2% |
| 10 | 448 | 0.1% |
| 11 | 418 | 0.1% |
| 23 | 413 | 0.1% |
| 12 | 409 | 0.1% |
| Other values (9528) | 295627 |
| Value | Count | Frequency (%) |
| 1 | 471 | |
| 2 | 500 | |
| 3 | 515 | |
| 4 | 485 | |
| 5 | 453 | |
| 6 | 451 | |
| 7 | 394 | |
| 8 | 399 | |
| 9 | 379 | |
| 10 | 448 |
| Value | Count | Frequency (%) |
| 368506 | 23 | |
| 348526 | 25 | |
| 341695 | 19 | |
| 336643 | 24 | |
| 323792 | 20 | |
| 314672 | 17 | |
| 310782 | 17 | |
| 302580 | 23 | |
| 298651 | 17 | |
| 289045 | 16 |
AANTAL_PAT_PER_SPC
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 296 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 673983.9904 |
| Minimum | 13 |
|---|---|
| Maximum | 1487650 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 13 |
|---|---|
| 5-th percentile | 26921 |
| Q1 | 277929 |
| median | 747032 |
| Q3 | 1021519 |
| 95-th percentile | 1336157 |
| Maximum | 1487650 |
| Range | 1487637 |
| Interquartile range (IQR) | 743590 |
Descriptive statistics
| Standard deviation | 419586.1431 |
|---|---|
| Coefficient of variation (CV) | 0.622546157 |
| Kurtosis | -1.11768345 |
| Mean | 673983.9904 |
| Median Absolute Deviation (MAD) | 316556 |
| Skewness | -0.05201073889 |
| Sum | 2.023232541 × 10¹¹ |
| Variance | 1.760525315 × 10¹¹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 880959 | 5102 | 1.7% |
| 874172 | 4354 | 1.5% |
| 843990 | 4348 | 1.4% |
| 894394 | 4333 | 1.4% |
| 880551 | 4273 | 1.4% |
| 894928 | 4212 | 1.4% |
| 754549 | 4082 | 1.4% |
| 1084060 | 3890 | 1.3% |
| 1101091 | 3864 | 1.3% |
| 1063588 | 3851 | 1.3% |
| Other values (286) | 257881 |
| Value | Count | Frequency (%) |
| 13 | 3 | < 0.1% |
| 117 | 54 | < 0.1% |
| 125 | 41 | < 0.1% |
| 248 | 58 | < 0.1% |
| 359 | 38 | < 0.1% |
| 393 | 123 | |
| 771 | 159 | |
| 933 | 226 | |
| 1611 | 130 | |
| 1623 | 133 |
| Value | Count | Frequency (%) |
| 1487650 | 2975 | |
| 1450424 | 3048 | |
| 1421817 | 3564 | |
| 1345182 | 3543 | |
| 1336157 | 3439 | |
| 1332850 | 3546 | |
| 1317329 | 3463 | |
| 1283064 | 3577 | |
| 1265255 | 1177 | 0.4% |
| 1262552 | 1201 | 0.4% |
AANTAL_SUBTRAJECT_PER_SPC
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 296 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1085856.523 |
| Minimum | 13 |
|---|---|
| Maximum | 2666840 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 13 |
|---|---|
| 5-th percentile | 39189 |
| Q1 | 483127 |
| median | 1080458 |
| Q3 | 1729091 |
| 95-th percentile | 2557598 |
| Maximum | 2666840 |
| Range | 2666827 |
| Interquartile range (IQR) | 1245964 |
Descriptive statistics
| Standard deviation | 741431.7155 |
|---|---|
| Coefficient of variation (CV) | 0.6828081796 |
| Kurtosis | -0.8569613786 |
| Mean | 1085856.523 |
| Median Absolute Deviation (MAD) | 648633 |
| Skewness | 0.2941317302 |
| Sum | 3.259632695 × 10¹¹ |
| Variance | 5.497209888 × 10¹¹ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1211813 | 5102 | 1.7% |
| 1281617 | 4354 | 1.5% |
| 1216294 | 4348 | 1.4% |
| 1315716 | 4333 | 1.4% |
| 1300626 | 4273 | 1.4% |
| 1336490 | 4212 | 1.4% |
| 1136773 | 4082 | 1.4% |
| 2557598 | 3890 | 1.3% |
| 2666840 | 3864 | 1.3% |
| 2488271 | 3851 | 1.3% |
| Other values (286) | 257881 |
| Value | Count | Frequency (%) |
| 13 | 3 | < 0.1% |
| 117 | 54 | < 0.1% |
| 126 | 41 | < 0.1% |
| 248 | 58 | < 0.1% |
| 367 | 38 | < 0.1% |
| 397 | 123 | |
| 783 | 159 | |
| 1010 | 226 | |
| 1781 | 71 | < 0.1% |
| 1855 | 133 |
| Value | Count | Frequency (%) |
| 2666840 | 3864 | |
| 2603380 | 3845 | |
| 2578573 | 3769 | |
| 2557598 | 3890 | |
| 2488271 | 3851 | |
| 2184164 | 3757 | |
| 2178821 | 3635 | |
| 2066228 | 3810 | |
| 2045007 | 1169 | 0.4% |
| 1990307 | 1167 | 0.4% |
GEMIDDELDE_VERKOOPPRIJS
Real number (ℝ≥0)
| Distinct | 3395 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 49201 |
| Missing (%) | 16.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3554.785528 |
| Minimum | 70 |
|---|---|
| Maximum | 287220 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 2.3 MiB |
Quantile statistics
| Minimum | 70 |
|---|---|
| 5-th percentile | 140 |
| Q1 | 470 |
| median | 1250 |
| Q3 | 4135 |
| 95-th percentile | 13425 |
| Maximum | 287220 |
| Range | 287150 |
| Interquartile range (IQR) | 3665 |
Descriptive statistics
| Standard deviation | 6543.689664 |
|---|---|
| Coefficient of variation (CV) | 1.840811383 |
| Kurtosis | 155.0190446 |
| Mean | 3554.785528 |
| Median Absolute Deviation (MAD) | 1020 |
| Skewness | 7.434401821 |
| Sum | 892212065 |
| Variance | 42819874.42 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 160 | 1845 | 0.6% |
| 105 | 1837 | 0.6% |
| 110 | 1779 | 0.6% |
| 180 | 1485 | 0.5% |
| 145 | 1425 | 0.5% |
| 300 | 1323 | 0.4% |
| 165 | 1258 | 0.4% |
| 125 | 1258 | 0.4% |
| 185 | 1249 | 0.4% |
| 140 | 1233 | 0.4% |
| Other values (3385) | 236297 | |
| (Missing) | 49201 | 16.4% |
| Value | Count | Frequency (%) |
| 70 | 227 | 0.1% |
| 75 | 75 | < 0.1% |
| 80 | 362 | 0.1% |
| 85 | 917 | |
| 90 | 677 | 0.2% |
| 95 | 665 | 0.2% |
| 100 | 896 | |
| 105 | 1837 | |
| 110 | 1779 | |
| 115 | 897 |
| Value | Count | Frequency (%) |
| 287220 | 8 | |
| 148910 | 3 | < 0.1% |
| 142835 | 4 | |
| 122155 | 4 | |
| 116765 | 3 | < 0.1% |
| 109725 | 7 | |
| 108570 | 7 | |
| 107655 | 4 | |
| 101270 | 8 | |
| 95465 | 7 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
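The rank-based definition can be sketched in plain Python. This is an illustration of the formula rather than the code the report runs; ties receive their average rank, and the helper names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def _correlation(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def spearman_rho(xs, ys):
    """Spearman's rho: the correlation of the rank variables."""
    return _correlation(average_ranks(xs), average_ranks(ys))


# A nonlinear but strictly increasing relationship is perfectly monotonic.
print(round(spearman_rho([1, 2, 3, 4, 5], [1, 8, 27, 64, 125]), 6))  # 1.0
```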
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
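A minimal sketch of this calculation, including a check of the invariance under positive rescaling and shifting; the function name `pearson_r` and the sample values are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: covariance over the product of the standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # The 1/n factors of covariance and standard deviations cancel out.
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


xs = [1, 2, 3, 4]
ys = [2, 1, 4, 3]
r = pearson_r(xs, ys)

# Shifting and rescaling each variable separately leaves r unchanged.
r_rescaled = pearson_r([3 * x + 5 for x in xs], [0.5 * y - 2 for y in ys])
print(round(r, 6), round(r_rescaled, 6))  # 0.6 0.6
```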
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
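The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows; note that statistical libraries often report a tie-corrected variant (tau-b) instead, so values can differ when ties are present. The function name is hypothetical.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair oppositely
        # sign == 0 is a tie in x or y and counts as neither
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)


print(kendall_tau_a([1, 2, 3], [1, 2, 3]))  # 1.0
print(kendall_tau_a([1, 2, 3], [3, 2, 1]))  # -1.0
```

This direct version compares every pair, so its cost grows quadratically with the number of rows; for a table of about 300,000 observations a library implementation would be the practical choice.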
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. The report uses the bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a correlation coefficient that works consistently across categorical, ordinal, and interval variables, captures nonlinear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.
First rows
| | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 2022-06-21 | 2022-06-01 | 2018-01-01 | 329 | 07 | 990029010 | 227 | 234 | 1439 | 1500 | 21988 | 24177 | 1345.0 |
| 1 | 1.0 | 2022-06-21 | 2022-06-01 | 2018-01-01 | 329 | 15 | 990029002 | 149 | 150 | 1029 | 1075 | 21988 | 24177 | 205.0 |
| 2 | 1.0 | 2022-06-21 | 2022-06-01 | 2018-01-01 | 329 | 20 | 990029011 | 2 | 2 | 4 | 4 | 21988 | 24177 | 545.0 |
| 3 | 1.0 | 2022-06-21 | 2022-06-01 | 2018-01-01 | 329 | 12 | 990029010 | 14 | 16 | 114 | 120 | 21988 | 24177 | 1345.0 |
| 4 | 1.0 | 2022-06-21 | 2022-06-01 | 2018-01-01 | 329 | 20 | 990029010 | 2 | 2 | 4 | 4 | 21988 | 24177 | 1345.0 |
| 5 | 1.0 | 2022-06-21 | 2022-06-01 | 2018-01-01 | 329 | 01 | 990029010 | 53 | 54 | 331 | 348 | 21988 | 24177 | 1345.0 |
| 6 | 1.0 | 2022-06-21 | 2022-06-01 | 2018-01-01 | 329 | 03 | 990029011 | 375 | 379 | 847 | 863 | 21988 | 24177 | 545.0 |
| 7 | 1.0 | 2022-06-21 | 2022-06-01 | 2018-01-01 | 329 | 17 | 990029010 | 73 | 73 | 695 | 704 | 21988 | 24177 | 1345.0 |
| 8 | 1.0 | 2022-06-21 | 2022-06-01 | 2018-01-01 | 329 | 05 | 990029010 | 256 | 261 | 1284 | 1345 | 21988 | 24177 | 1345.0 |
| 9 | 1.0 | 2022-06-21 | 2022-06-01 | 2018-01-01 | 329 | 02 | 990029011 | 1375 | 1386 | 4872 | 4977 | 21988 | 24177 | 545.0 |
Last rows
| | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 300180 | 1.0 | 2022-06-21 | 2022-06-01 | 2014-01-01 | 313 | 779 | 979003009 | 2 | 3 | 777 | 1239 | 1038891 | 2066228 | 22695.0 |
| 300181 | 1.0 | 2022-06-21 | 2022-06-01 | 2016-01-01 | 303 | 238 | 199299064 | 1 | 1 | 12559 | 13887 | 1332850 | 1833711 | 6200.0 |
| 300182 | 1.0 | 2022-06-21 | 2022-06-01 | 2016-01-01 | 303 | 825 | 179799015 | 3 | 3 | 12 | 17 | 1332850 | 1833711 | 135.0 |
| 300183 | 1.0 | 2022-06-21 | 2022-06-01 | 2016-01-01 | 303 | 402 | 972802098 | 1 | 1 | 6177 | 8116 | 1332850 | 1833711 | 3935.0 |
| 300184 | 1.0 | 2022-06-21 | 2022-06-01 | 2015-01-01 | 322 | 1303 | 972802093 | 1 | 1 | 25739 | 72851 | 442207 | 759969 | 11610.0 |
| 300185 | 1.0 | 2022-06-21 | 2022-06-01 | 2016-01-01 | 303 | 269 | 199299061 | 1 | 1 | 4541 | 5249 | 1332850 | 1833711 | 2815.0 |
| 300186 | 1.0 | 2022-06-21 | 2022-06-01 | 2014-01-01 | 313 | 842 | 29499056 | 3 | 3 | 2105 | 6510 | 1038891 | 2066228 | NaN |
| 300187 | 1.0 | 2022-06-21 | 2022-06-01 | 2016-01-01 | 303 | 251 | 199299090 | 1 | 1 | 6583 | 6750 | 1332850 | 1833711 | 2655.0 |
| 300188 | 1.0 | 2022-06-21 | 2022-06-01 | 2012-01-01 | 303 | 359 | 20117020 | 1 | 1 | 6017 | 7168 | 1487650 | 1939595 | 1330.0 |
| 300189 | 1.0 | 2022-06-21 | 2022-06-01 | 2022-01-01 | 306 | 014 | 149599025 | 1 | 1 | 13 | 13 | 3298 | 3419 | NaN |